The wavelet scattering transform creates geometric invariants and deformation stability. In multiple signal domains, it has been shown to yield more discriminative representations than other non-learned representations and, in certain tasks, particularly those with limited labeled data and highly structured signals, to outperform learned representations. The wavelet filters used in the scattering transform are typically chosen to create a tight frame via a parameterized mother wavelet. In this work, we investigate whether this standard wavelet filterbank construction is optimal. Focusing on Morlet wavelets, we propose to learn the scales, orientations, and aspect ratios of the filters to produce problem-specific parameterizations of the scattering transform. We show that our learned versions of the scattering transform yield significant performance gains over the standard scattering transform in the small-sample classification setting. Moreover, our empirical results suggest that traditional filterbank constructions may not always be necessary for scattering transforms to extract effective representations.
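As a rough illustration of the parameterization described above, the sketch below builds a single 2D Morlet filter from the three quantities the authors propose to learn: scale, orientation, and aspect ratio. The filter form and the fixed carrier frequency xi are typical choices, not necessarily the exact ones used in the paper.

```python
import numpy as np

def morlet_filter(size, sigma, theta, gamma, xi=3 * np.pi / 4):
    """2D Morlet filter parameterized by scale (sigma), orientation (theta),
    and aspect ratio (gamma); xi is a commonly used carrier frequency."""
    coords = np.arange(-(size // 2), size - size // 2)
    x, y = np.meshgrid(coords, coords, indexing="ij")
    # Rotate coordinates by theta
    x_r = x * np.cos(theta) + y * np.sin(theta)
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    # Gaussian envelope stretched by the aspect ratio gamma
    envelope = np.exp(-(x_r**2 + (gamma * y_r) ** 2) / (2 * sigma**2))
    wave = np.exp(1j * xi * x_r)
    # Subtract the DC term so the filter has (approximately) zero mean
    beta = np.sum(wave * envelope) / np.sum(envelope)
    psi = (wave - beta) * envelope
    return psi / np.linalg.norm(psi)

# A small filterbank over candidate (sigma, theta, gamma) values; in a learned
# scattering transform these parameters would be optimized rather than fixed.
bank = [morlet_filter(32, s, t, g)
        for s in (2.0, 4.0) for t in (0.0, np.pi / 4) for g in (0.5, 1.0)]
```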
t-SNE remains one of the most popular embedding techniques for visualizing high-dimensional data. Most standard t-SNE packages, such as scikit-learn, use the Barnes-Hut t-SNE (BH t-SNE) algorithm for large datasets. However, existing CPU implementations of this algorithm are inefficient. In this work, we accelerate BH t-SNE on CPUs via cache optimizations, SIMD, parallelizing sequential steps, and improving the parallelization of multithreaded steps. Our implementation (Acc-t-SNE) is up to 261x and 4x faster than scikit-learn and the state-of-the-art BH t-SNE implementation from daal4py, respectively, on a 32-core Intel(R) Ice Lake cloud instance.
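For reference, the scikit-learn baseline that Acc-t-SNE is compared against can be invoked as below; this is standard scikit-learn usage on placeholder data, and Acc-t-SNE's own interface is not described in the abstract, so it is not shown.

```python
import numpy as np
from sklearn.manifold import TSNE

X = np.random.rand(5000, 50)  # placeholder high-dimensional data
# scikit-learn runs the Barnes-Hut algorithm by default for 2D/3D embeddings;
# this is the baseline implementation the reported speedups are measured over.
emb = TSNE(n_components=2, method="barnes_hut", n_jobs=-1).fit_transform(X)
```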
Position modeling plays a critical role in Transformers. In this paper, we focus on length extrapolation, i.e., training on short texts while evaluating longer sequences. We define attention resolution as an indicator of extrapolation. We then propose two designs to improve this metric for Transformers. Specifically, we introduce a relative position embedding that explicitly maximizes attention resolution. Moreover, we use blockwise causal attention during inference for better resolution. We evaluate different Transformer variants on language modeling. Experimental results show that our model achieves strong performance in both interpolation and extrapolation settings. The code will be available at https://aka.ms/LeX-Transformer.
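To make the second design concrete, here is a small sketch of one plausible blockwise causal attention mask, assuming each query attends causally within its own block and fully to the previous block; the paper's exact blocking scheme may differ.

```python
import torch

def blockwise_causal_mask(seq_len: int, block_size: int) -> torch.Tensor:
    """Boolean (seq_len, seq_len) mask; True means the query may attend to the key."""
    idx = torch.arange(seq_len)
    q_block = idx[:, None] // block_size        # block index of each query
    k_block = idx[None, :] // block_size        # block index of each key
    causal = idx[None, :] <= idx[:, None]       # standard causal constraint
    same_block = (q_block == k_block) & causal  # causal within the current block
    prev_block = (q_block - k_block) == 1       # full attention to the previous block
    return same_block | prev_block

mask = blockwise_causal_mask(seq_len=8, block_size=4)
```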
A true interpreting agent not only understands sign language and translates it to text, but also understands text and translates it to signs. Much of the AI work in sign language translation to date has focused mainly on translating from signs to text. Towards the latter goal, we propose a text-to-sign translation model, SignNet, which exploits the notion of similarity (and dissimilarity) of visual signs in translating. The module presented here is only one part of a dual-learning, two-task process involving text-to-sign (T2S) as well as sign-to-text (S2T). We currently implement SignNet as a single-channel architecture so that the output of the T2S task can be fed into S2T in a continuous dual-learning framework. By single channel, we refer to a single modality, the body pose joints. In this work, we present SignNet, a T2S model that uses a novel metric embedding learning process to preserve the distances between sign embeddings relative to their dissimilarity. We also describe how to choose positive and negative examples of signs for similarity testing. From our analysis, we observe that the metric embedding learning-based model performs significantly better than the other models with traditional losses when evaluated using BLEU scores. In the gloss-to-pose task, SignNet performed as well as its state-of-the-art (SoTA) counterparts, and it outperformed them in the text-to-pose task, showing noteworthy improvements in BLEU 1 - BLEU 4 scores (BLEU 1: 31->39, ~26% improvement; BLEU 4: 10.43->11.84, ~14% improvement) on the popular RWTH PHOENIX-Weather-2014T benchmark dataset.
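As an illustration of the metric-embedding idea, a standard triplet-style objective is sketched below. The abstract does not give SignNet's exact loss or mining strategy, so this is only the generic form: embeddings of similar signs are pulled together and dissimilar ones pushed at least a margin apart.

```python
import torch
import torch.nn.functional as F

def sign_triplet_loss(anchor, positive, negative, margin=1.0):
    """Generic triplet loss over sign embeddings: d(anchor, positive) should be
    smaller than d(anchor, negative) by at least `margin`."""
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()
```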
Workloads in modern cloud data centers are becoming increasingly complex. The number of workloads running in cloud data centers has been growing exponentially for the last few years, and cloud service providers (CSPs) have been supporting on-demand services in real time. Recognizing the growing complexity of cloud environments and cloud workloads, hardware vendors such as Intel and AMD are increasingly introducing cloud-specific workload acceleration features in their CPU platforms. These features are typically targeted towards popular and commonly used cloud workloads. Nonetheless, uncommon, customer-specific workloads (unknown workloads) whose characteristics differ from common workloads (known workloads) may not realize the full potential of the underlying platform. To address this problem, we develop a machine learning based technique to characterize, profile, and predict workloads running in the cloud environment. Experimental evaluation of our technique demonstrates good prediction performance. We also develop techniques to analyze the performance of the model in a standalone manner.
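A minimal sketch of the kind of workload classifier described here is shown below; the feature set, workload labels, model choice, and data are all placeholders, since the abstract does not name them.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Placeholder per-workload features, e.g. IPC, cache-miss rate, branch-miss
# rate, memory bandwidth (hypothetical choices for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))
y = rng.integers(0, 3, size=1000)  # hypothetical classes: web, database, analytics

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X[:800], y[:800])
pred = clf.predict(X[800:])  # predicted workload class for unseen samples
```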
With the rising adoption of Machine Learning across domains like banking, pharmaceuticals, and ed-tech, it has become of utmost importance to adopt responsible AI methods to ensure that models do not unfairly discriminate against any group. Given the lack of clean training data, generative adversarial techniques are preferred for generating synthetic data, with several state-of-the-art architectures readily available across various domains, from unstructured data such as text and images to structured datasets modelling fraud detection and more. These techniques overcome several challenges such as class imbalance, limited training data, and restricted access to data due to privacy issues. Existing work focusing on generating fair data either works only for a certain GAN architecture or is very difficult to tune across GANs. In this paper, we propose a pipeline to generate fairer synthetic data independent of the GAN architecture. The proposed pipeline uses a pre-processing algorithm to identify and remove bias-inducing samples. In particular, we claim that while generating synthetic data, most GANs amplify bias present in the training data, but by removing these bias-inducing samples, GANs essentially focus more on real informative samples. Our experimental evaluation on two open-source datasets demonstrates that the proposed pipeline generates fairer data, along with improved performance in some cases.
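The abstract does not state how "bias-inducing" samples are identified; under that caveat, the sketch below shows one common pre-processing heuristic: sub-sampling over-represented (protected group, label) cells so the label becomes roughly independent of the protected attribute before GAN training.

```python
import numpy as np

def debias_by_subsampling(X, y, a, seed=0):
    """Drop samples from over-represented (group, label) cells so that the label
    y is approximately independent of the protected attribute a.
    NOTE: a generic pre-processing heuristic for illustration; the paper's
    actual criterion for bias-inducing samples may differ."""
    rng = np.random.default_rng(seed)
    keep = np.ones(len(y), dtype=bool)
    for g in np.unique(a):
        for c in np.unique(y):
            cell = np.where((a == g) & (y == c))[0]
            # expected cell size if a and y were independent
            expected = int(round((a == g).mean() * (y == c).mean() * len(y)))
            if len(cell) > expected:
                drop = rng.choice(cell, size=len(cell) - expected, replace=False)
                keep[drop] = False
    return X[keep], y[keep], a[keep]
```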
Meta reinforcement learning (meta-RL) is an approach in which the experience gained from solving a variety of tasks is distilled into a meta-policy. The meta-policy, when adapted over only a small (or even just one) number of steps, is able to perform close to optimal on a new related task. However, a major challenge in adopting this approach to solving real-world problems is that they are often associated with sparse reward functions that only indicate whether a task is partially or fully completed. We consider the setting where some data, possibly generated by a sub-optimal agent, is available for each task. We then develop a class of algorithms called Enhanced Meta-RL using Demonstrations (EMRLD) that exploit this information even when it amounts to sub-optimal guidance obtained during training. We show how EMRLD jointly utilizes RL and supervised learning over the offline data to generate a meta-policy that demonstrates monotone improvements in performance. We also develop a warm-started variant called EMRLD-WS that is particularly effective for sub-optimal demonstration data. Finally, we show that our EMRLD algorithms significantly outperform existing approaches in a variety of sparse reward environments, including that of a mobile robot.
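The following sketch illustrates the joint use of RL and supervised learning on demonstration data that the abstract describes; the policy class, the policy-gradient surrogate, and the weighting are assumptions made for illustration, not the actual EMRLD objective.

```python
import torch
import torch.nn as nn

class GaussianPolicy(nn.Module):
    """Minimal diagonal-Gaussian policy, included only to make the sketch self-contained."""
    def __init__(self, obs_dim, act_dim):
        super().__init__()
        self.mean = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, act_dim))
        self.log_std = nn.Parameter(torch.zeros(act_dim))

    def log_prob(self, obs, act):
        dist = torch.distributions.Normal(self.mean(obs), self.log_std.exp())
        return dist.log_prob(act).sum(-1)

def combined_loss(policy, rl_batch, demo_batch, bc_weight=1.0):
    """Policy-gradient term on online data plus a behavior-cloning term on
    (possibly sub-optimal) offline demonstrations."""
    obs, act, adv = rl_batch
    rl_loss = -(policy.log_prob(obs, act) * adv).mean()
    demo_obs, demo_act = demo_batch
    bc_loss = -policy.log_prob(demo_obs, demo_act).mean()
    return rl_loss + bc_weight * bc_loss
```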
We propose SeRP, a framework for self-supervised learning on 3D point clouds. SeRP consists of an encoder-decoder architecture that takes perturbed or corrupted point clouds as input and aims to reconstruct the original, uncorrupted point cloud. The encoder learns high-level latent representations of the point clouds in a low-dimensional subspace, from which the original structure is recovered. In this work, we use Transformer-based and PointNet-based autoencoders. The proposed framework also addresses some of the limitations of Transformer-based masked autoencoders, which are prone to leaking location information and to uneven information density. We trained the models on the complete ShapeNet dataset and evaluated them on a downstream classification task. We show that the pretrained models achieve 0.5-1% higher classification accuracy than networks trained from scratch. Furthermore, we also propose VASP: Vector-Quantized Autoencoders for Self-supervised representation learning on point clouds, which employ discrete representation learning for Transformer-based autoencoders.
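A rough sketch of the perturb-and-reconstruct objective described above is given below; the jitter-style corruption and the Chamfer reconstruction loss are plausible choices for this kind of framework, not necessarily the ones used in the paper.

```python
import torch

def perturb(points, noise_std=0.02, frac=0.25):
    """Corrupt a fraction of the points with Gaussian jitter (one simple choice
    of perturbation; the paper's exact corruption scheme may differ)."""
    out = points.clone()
    idx = torch.randperm(points.shape[0])[: int(frac * points.shape[0])]
    out[idx] += noise_std * torch.randn(len(idx), 3)
    return out

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between two point sets of shape (N, 3) and (M, 3)."""
    d = torch.cdist(a, b)  # pairwise distances
    return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()

# Training step sketch: encode the perturbed cloud, decode, and penalize
# deviation from the clean cloud, e.g.
# loss = chamfer_distance(decoder(encoder(perturb(pc))), pc)
```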
Computer vision is often performed using Convolutional Neural Networks (CNNs). CNNs are compute-intensive and are deployed on power-constrained systems such as mobile and Internet-of-Things (IoT) devices. CNNs are compute-intensive because they indiscriminately compute many features over all pixels of the input image. We observe that, for a given computer vision task, images often contain pixels that are irrelevant to the task. For example, if the task is looking for cars, pixels in the sky are not very useful. Therefore, we propose that CNNs be modified to operate only on relevant pixels to save computation and energy. We propose a method to study three popular computer vision datasets and find that 48% of the pixels are irrelevant. We also propose the focused convolution, a modification of the CNN's convolutional layers that rejects pixels marked as clearly irrelevant. On an embedded device, we observe no loss in accuracy, while inference latency, energy consumption, and multiply-add count are each reduced by about 45%.
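As a simplified illustration of the idea, the sketch below gates a convolution with a per-pixel relevance mask; note that the real savings come from actually skipping the masked computation, which the paper's focused convolution does and this masking sketch does not.

```python
import torch
import torch.nn as nn

class MaskedConv2d(nn.Module):
    """Sketch of relevance gating: zero out inputs and outputs at locations
    marked irrelevant so downstream work concentrates on relevant pixels.
    (Illustrative only; it does not skip the masked multiply-adds.)"""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, k, padding=k // 2)

    def forward(self, x, relevance):  # relevance: (B, 1, H, W) in {0, 1}
        return self.conv(x * relevance) * relevance
```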
One of the challenges in language teaching is how to organize the rules of a language's grammar in a meaningful manner. This requires not only teaching skills but also a deep understanding of the language. While comprehensive materials for developing such curricula are available for English and some widely spoken languages, for many other languages teachers need to create them manually to cater to the needs of their students. This process is challenging because i) it requires that such experts be accessible and have the necessary resources, and ii) even when such experts exist, describing all the intricacies of a language is time-consuming and prone to omissions. In this paper, we present an automatic framework that aims to facilitate this process by automatically discovering and visualizing descriptions of various aspects of grammar. Specifically, we extract descriptions from a natural text corpus that answer questions about morphosyntax (the learning of word order, agreement, case marking, or word formation) and semantics (the learning of vocabulary), and present them with illustrative examples. We apply this method to teaching two Indian languages, Kannada and Marathi, which, unlike English, do not have well-developed pedagogical resources and are therefore likely to benefit from this exercise. To evaluate the perceived utility of the extracted material, we enlist the help of language educators who teach these languages in schools in North America to carry out a manual evaluation. Overall, the teachers found the material interesting as reference material for their own lesson preparation and even for learner evaluation.